CHAPTER 11 Comparing Average Values between Groups 143
Comparing the mean of two
groups of numbers
Comparing the mean of two groups of numbers is probably the most common
situation encountered in biostatistics. You may be comparing mean levels of a
protein that is a hypothesized disease biomarker between a group of patients
known to have the disease and a group of healthy controls. Or, you may be
comparing a measurement of drug efficacy between two groups of patients with the
same condition who are taking two different drugs. Or, you may be comparing
measurements of breast cancer treatment efficacy between women on one health
insurance plan and women on another health insurance plan.
Such comparisons are generally handled by the famous unpaired or “independent
sample” Student t test (usually just called the t test) that we describe later in the
section “Surveying Student t tests.” Importantly, the t test is based on two
assumptions about the distribution of the measurement value being tested in the two
groups:
» The values must be normally distributed (called the normality assumption).
For data that are not normally distributed, instead of the t test, you can use
the nonparametric Wilcoxon Sum-of-Ranks test (also called the Mann-Whitney U
test and the Mann-Whitney test). We demonstrate the Wilcoxon Sum-of-Ranks
test later in this chapter in the section “Running nonparametric tests.”
» The standard deviation (SD) of the values must be close for both groups
(called the equal variance assumption). As a reminder, the SD is the square root
of the variance. To remember why accounting for variation is important in
sampling, review Chapter 3. Also, Chapter 9 provides more information about
the importance of SD. If the two groups you are comparing have very different
SDs, you should not use a Student t test, because it may not give reliable
results, especially if you are also comparing groups of different sizes. A rule of
thumb is that the larger group SD divided by the smaller group SD should not be
more than 1.5 to qualify for a Student t test. If you feel your data do not qualify,
you can use an alternative called the Welch test (also called the Welch t test, or
the unequal-variance t test). As you see later in this chapter under “Surveying
Student t tests,” because the Welch test works whether or not the variances are
equal, it is the version that R statistical software runs by default.
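
The SD rule of thumb and the difference between the Student and Welch t tests can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used later in the chapter: the function names (sd_ratio, student_t, welch_t, choose_t_test) and the biomarker values are hypothetical, and because the sketch uses only the standard library, it stops at the t statistic and degrees of freedom and does not compute a p value.

```python
import math
import statistics


def sd_ratio(group_a, group_b):
    """Return the larger sample SD divided by the smaller sample SD."""
    sd_a = statistics.stdev(group_a)
    sd_b = statistics.stdev(group_b)
    return max(sd_a, sd_b) / min(sd_a, sd_b)


def student_t(group_a, group_b):
    """Student t statistic and degrees of freedom, using a pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pooling assumes both groups share one underlying variance.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, n_a + n_b - 2


def welch_t(group_a, group_b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Each group contributes its own variance; nothing is pooled.
    va = statistics.variance(group_a) / n_a
    vb = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df


def choose_t_test(group_a, group_b, max_ratio=1.5):
    """Apply the rule of thumb: an SD ratio above max_ratio suggests Welch."""
    return "student" if sd_ratio(group_a, group_b) <= max_ratio else "welch"


# Hypothetical biomarker levels, made up purely for illustration.
patients = [4.1, 5.3, 6.8, 3.9, 7.2, 5.5]
controls = [4.8, 5.0, 5.2, 4.9, 5.1]

print("SD ratio:", round(sd_ratio(patients, controls), 2))
print("Suggested test:", choose_t_test(patients, controls))
print("Student t, df:", student_t(patients, controls))
print("Welch t, df:", welch_t(patients, controls))
```

When the two groups are the same size, the two t statistics are identical and only the degrees of freedom differ. The results diverge more as the group sizes and SDs become more unequal, which is the situation the equal variance assumption warns about.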
Comparing the means of three or
more groups of numbers
Comparing the means of three or more groups of numbers is an obvious extension
of the two-group comparison in the preceding section. For example, you may have